Innovation and technology have transformed the way the banking industry delivers products and
services to its customers. Mobile banking is an effective way of performing transactions because it
can be used anywhere and at any time. The evolution of the banking experience is important for
fulfilling customers’ needs and demands, especially in a highly competitive banking industry.
Through mobile banking applications, customers can express their satisfaction and dissatisfaction
directly on the application store platform. Fulfilling customer satisfaction is important to avoid
customer attrition. This research focuses on customer feedback on six mobile banking applications
in Malaysia, namely Maybank, Commerce International Merchant Bankers (CIMB), Public Bank, Hong
Leong Bank, Rashid Hussein Bank (RHB) and AmBank. It aims to identify keywords related to customer
feedback on mobile banking, classify the sentiment, and evaluate classification performance using
the supervised machine learning algorithms support vector machine (SVM) and naive Bayes (NB). The
results show that linear SVM is the best model, with the highest accuracy, precision, recall and
F1-score of 97.17%, 97.21%, 97.17% and 97.18% respectively. With this high accuracy, the model is
expected to perform well in classifying customer feedback on mobile banking applications.

1. INTRODUCTION

A customer can be defined as a receiver, consumer or buyer of a company’s products and services.
Opinions and feedback given by customers after experiencing products and services are beneficial
for aligning business operations with customers’ needs. In this era of the internet and the digital
world, a vast amount of knowledge and information is readily available at one’s fingertips. The
advancement of technology is pushing the banking industry towards mobile banking. The
transformation of banking transactions from paper-based to electronic payment is one example of how
the banking industry has evolved. In the banking industry, the prospect of revenue growth and
operational efficiency is essential for every business to stay relevant and survive. High
competition between banks is one of the factors that leads to this migration.

Nowadays, customers prefer to use mobile phones for all kinds of activities, including obtaining
services. One of the reasons is that mobile phones are widely accessible regardless of social
status. Internet services that are readily available at a low cost have also contributed to the
rise in mobile phone users. This is where the need for mobile banking applications emerged. Mobile
banking is defined as the capability of performing financial transactions via a mobile device [1].
Thus, customers can access their respective bank accounts and perform transactions anytime and
anywhere without the need to go to a physical bank.

In Malaysia, a number of mobile banking applications are available, namely Maybank My,
Commerce International Merchant Bankers (CIMB) Clicks, PB Engage, Rashid Hussein Bank (RHB) Now
and many more. With a wide range of feedback and reviews available, sentiment analysis is
important for a company to classify those comments for product and service improvement. In this
competitive world, building a strong relationship with the customer has become an important
strategy. Banks must compete to provide the best customer satisfaction by introducing innovative
strategies [2]. According to [3], a business spends a huge amount of money and time on brand
monitoring and gathering real-time customer feedback. Dissatisfaction with the services provided
would affect business reputation and lead to customer attrition, which consequently impacts the
business’s revenue. Customer attrition is an important issue faced not just in the banking industry
but also by insurance companies, mobile service providers and many more [4]. Prompt action should
be taken to retain existing customers and attract potential customers to do business with the
bank. The implementation of analytics is undeniably important for a business if it can be
rationally applied to monitoring customer behaviour towards business operations [5].

The aim of this research is to classify customers’ feedback on mobile banking applications by
using machine learning (ML) techniques. Three objectives are defined: first, to classify customer
feedback based on sentiment polarity score; second, to identify keywords related to customer
feedback on the mobile banking experience; and lastly, to evaluate the performance of sentiment
analysis classifiers by using performance metrics. This research focuses on the classification of
large-scale customer feedback on mobile banking applications in Malaysia. The customer feedback
data are extracted for six (6) banking institutions in Malaysia from the reviews posted on their
official mobile banking application platforms. The ML algorithms naive Bayes (NB) and support
vector machine (SVM) are used to classify sentiment, and their accuracy is evaluated. For NB, the
variants used are multinomial naive Bayes (MNB) and Bernoulli naive Bayes (BNB), while for SVM, the
kernel used is linear, giving a linear support vector machine (LSVM). The findings of this research
are significant as they provide insight for banking institutions into users’ reactions towards
their mobile banking applications, helping them fulfil the customer experience. According to [2],
it is vital for banks to collect customer feedback from various banking services. With the results
of this research, an application can be improved to suit customers’ needs by enhancing the related
aspects in order to compete with other mobile applications in the banking industry. In addition,
this research is useful in providing information for the public on the performance of the
available mobile banking applications. The quality of service provided through mobile banking could
be a reason for customers to choose to do business with a bank. Furthermore, a good mobile banking
platform would be a factor in retaining customers’ savings and financial transactions, which in
turn strengthens the bond between the banking institution and the customer.

Sentiment analysis of customer feedback and reviews has been widely covered in various fields by
past researchers. Hasan et al. [6] conducted an ML-based sentiment analysis focusing on user tweets
about politics in Pakistan. The tweets were written in Urdu and translated to English. In that
research, the accuracy of NB and SVM was compared. The results show that word sequence
disambiguation (W-WSD) had the highest accuracy with the NB classifier at 79.00%, followed by
76.00% and 54.75% for TextBlob and SentiWordNet. With SVM, TextBlob had the highest accuracy in
comparison with W-WSD and SentiWordNet, at 62.67%, 62.33% and 53.33% respectively.

Lien [7] analysed reviews from bank customers in Norway. The author defined the polarity
proportion based on the 1-5 star ratings given by the bank customers. The reviews were gathered
from three sources: the bank’s review sites, social media and discussion forums. The ML algorithms
Gaussian naive Bayes (GNB), LSVM and maximum entropy (ME) were used, and the models were evaluated
with 5-fold cross validation. The results show that ME achieved the highest accuracy. Rana and
Singh [8] used LSVM and NB to classify film user reviews and detect opinion. The accuracy of the
algorithms was measured for each movie genre, namely action, adventure, drama and romantic; LSVM
showed a higher accuracy than NB in all genres mentioned, with the highest accuracy for the drama
genre. Furthermore, Ayo et al. [9] used RapidMiner to combine tweets and Facebook comments from
five major banks in Nigeria. The results are divided into two parts: sentiment analysis and
clustering analysis. The sentiment analysis compared the most negative value (MNV) and most
positive value (MPV) of tweets and comments between banks. Kumar and Dabas [10] proposed a social
media complaint workflow automation tool that uses sentiment analysis on social media to actively
respond to complaints. In that study, three variants of the NB classifier (MNB, BNB and GNB) and
SVM were used. The classifiers were evaluated on social media posts of HDFC Bank India. The results
show that the MNB classifier achieved approximately 83% and 75% accuracy in sentiment
classification and department classification respectively. Altrabsheh et al. [11] analysed
real-time student feedback with sentiment analysis. Data on students’ feedback, opinions and
feelings about lecture sessions were collected. The techniques used in the experiments were NB,
complement naive Bayes (CNB), ME and three SVM kernels, namely linear, radial basis and polynomial.
The results show that SVM achieved high precision, recall and F-score. Shi and Li [12] used a
sentiment analysis model for online hotel reviews. The study used SVM for sentiment classification
with unigram features based on frequency and term frequency-inverse document frequency (TF-IDF).
The results show that TF-IDF is more efficient. Go et al. [13] used ML techniques to categorize
Twitter posts as either positive or negative. The supervised classifiers used were NB, ME and SVM,
where NB had the highest result of 84.2% compared to the other two classifiers.

From the literature, SVM and NB are the techniques most commonly used in sentiment analysis
research. This section has summarized the techniques frequently used for sentiment analysis of
customer feedback. With this motivation, this paper proposes to use the NB and SVM methods and
compare the results to find the best model for sentiment analysis of customer feedback on mobile
banking applications.


2. RESEARCH METHOD 

Based on the cross-industry standard process for data mining (CRISP-DM) method [14], this research
proposes the research model shown in Figure 1. The processes involved are data collection, data
pre-processing, model development, model evaluation and lastly results deployment.


[Figure 1: flowchart of the proposed model with five stages.
Data Collection: Google Playstore, WebHarvy Tool.
Data Preprocessing: Demojization, Noise Removal, Text Normalization, Data Translation,
Tokenization, Stopword Elimination, Lemmatization, Feature Selection.
Model Development: Sentiment Detection.
Model Evaluation: Naive Bayes, Support Vector Machine.
Results Deployment: Visualization.]

Figure 1. Proposed model


In the first process, customer feedback on the mobile banking applications is collected from
Google Playstore for Maybank, CIMB, Public Bank, RHB Bank, Hong Leong Bank and AM Bank. The reviews
were extracted on 7th March 2020, focusing on the first 100 pages of the user review section. The
authors of the reviews are bank customers, and the reviews were written in a mix of languages
consisting of Malay, Chinese and English. Three attributes are used, namely ‘Rating’, which
contains the rating given by the user, ‘Descriptions’, which contains the user’s review of the
application, and ‘Bank Type’, which identifies the bank for each review.

Pre-processing improves sentiment analysis by cleaning undesirable elements from the data, which
increases accuracy and reduces errors in the outcomes. The user reviews contain a great amount of
vague information that needs to be eliminated. Shekhawat [15] stated that the data cleaning process
is important for computing the sentiment score so that the machine can easily understand the text.
Pre-processing involves demojization, which transforms emojis into their textual equivalents [16],
[17], and noise removal for text normalization [18]. The reviews captured are in multiple
languages such as Malay, Chinese and English. Since Malaysia is a multi-racial country, it is
normal for people to give reviews and feedback in their preferred language. For this research,
translation is applied to convert the reviews into English. Desai and Narvekar [19] stated that
spelling errors are produced unintentionally due to human error. The spelling correction process
is important to prevent the system from ignoring important words in the reviews. Tokenization is
used to divide the text of documents into a sequence of separate words or tokens [20], [21].
Stopword elimination is implemented to enhance system performance while reducing the amount of
text [22], [23]. Lemmatization is the technique of reducing related words to a common root form
[8], [22]. For example, the word variations “feature”, “featuring”, “features” and “featured” all
belong to the root word “feature”.
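
Some of the cleaning steps described above can be sketched in a few lines. The following is a
minimal illustration using only the Python standard library; the stopword list, the regular
expressions and the function name `preprocess` are assumptions made for the example, not the
authors' actual pipeline. Demojization, translation, spelling correction and lemmatization are
omitted because they normally rely on external tools.

```python
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "is", "a", "an", "to", "and", "it", "this", "i", "my"}

def preprocess(review: str) -> list[str]:
    """Lowercase, remove noise, tokenize and drop stopwords."""
    text = review.lower()
    text = re.sub(r"https?://\S+", " ", text)  # noise removal: URLs
    text = re.sub(r"[^a-z\s]", " ", text)      # noise removal: digits, punctuation, symbols
    tokens = text.split()                       # whitespace tokenization
    return [t for t in tokens if t not in STOPWORDS]

sample = "This app is SLOW!!! I cannot login to my account :( http://example.com"
print(preprocess(sample))
```

For the sample review, the remaining tokens are "app", "slow", "cannot", "login" and "account".
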

A feature in language processing refers to textual data that is converted to a numeric vector
[24]. In the data extraction phase, a bag of words needs to be converted into a vector model.
According to [18], there are three ways of converting terms into vectors, namely term frequency,
term occurrences and TF-IDF. TF-IDF is a weighting factor that can be used to replace binary and
word count representations [25]. Converting terms to vectors gives a lower weight to irrelevant
terms and a higher weight to relevant terms, with values closer to 0 and 1 respectively. In this
research, TF-IDF is used as the weighting scheme to create the word vector [26]-[28].
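
To illustrate how TF-IDF weights terms, the sketch below computes a simple variant by hand. It
assumes term frequency normalized by document length and the plain logarithmic inverse document
frequency log(N / df); libraries often use smoothed variants, and the paper does not state which
formulation was used. The toy documents and the function name `tf_idf` are invented for the
example.

```python
import math
from collections import Counter

def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

docs = [
    ["app", "slow", "login"],
    ["app", "easy", "fast"],
    ["app", "login", "failed"],
]
weights = tf_idf(docs)
for w in weights:
    print({term: round(value, 3) for term, value in w.items()})
```

A term that appears in every review, such as "app" here, receives a weight of zero, while rarer
terms such as "slow" receive higher weights.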

In model development, the clean dataset is used to detect the sentiment of each customer review as
positive, neutral or negative. To provide valuable insight from the emotions and opinions stated,
the TextBlob sentiment analyser is used to determine the polarity and subjectivity score of each
review. The polarity score [29] is determined by assigning a score from -1 to 1 based on the words
used, where a negative score represents a negative statement, a positive score represents a
positive statement and a zero value indicates a neutral statement. The subjectivity score, on the
other hand, indicates whether the review is subjective or objective. It ranges from 0 to 1; the
closer the score is to 1, the more subjective the text is. Reviews with a polarity score greater
than 0 are classified as positive sentiment, those with a score less than 0 as negative sentiment
and those with a score equal to zero as neutral sentiment. The polarity score is determined from
the reviews given by users under the “Descriptions” column. According to [30], in a survey of user
reviews posted on the Google Play Store, user reviews act as an important source of knowledge for
developers because they provide extensive information on issues and on improvements that can be
made to the application. Noei [31] stated that among the important pieces of information hidden in
user reviews are users’ expectations and concerns, feature requests, bug reports, and guidelines
for planning a future release.
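
The thresholding rule above can be expressed as a small function. This is a sketch only:
`label_sentiment` is a hypothetical helper name, and the polarity values in the usage lines are
made-up inputs rather than TextBlob outputs. In practice the score would come from the sentiment
analyser (with TextBlob, for example, `TextBlob(text).sentiment.polarity`).

```python
def label_sentiment(polarity: float) -> str:
    """Map a polarity score in [-1, 1] to a sentiment class."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"

# Hypothetical polarity scores for three reviews.
for score in (0.7, 0.0, -0.35):
    print(score, label_sentiment(score))
```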

Sentiment analysis consists of three polarity classes: positive, negative and neutral. The
sentiment polarity is set based on the user review instead of the rating score because of the
unclear definition and inconsistent personal interpretation of star ratings. The star rating given
might differ from the review stated. Once the polarity score has been determined, the sentiment
datasets are evaluated. The evaluation is based on the concept of the confusion matrix. The
classifiers used are SVM and NB. For NB, the variants used are MNB and BNB, while for SVM, the
kernel used is linear (LSVM). Because the sentiment polarity is a three-class classification, the
confusion matrix is extended as shown in Table 1.


Table 1. Confusion matrix of polarity sentiment classification
                   Predicted Positive   Predicted Neutral   Predicted Negative
Actual Positive    True Positive        False Neutral       False Negative
Actual Neutral     False Positive       True Neutral        False Negative
Actual Negative    False Positive       False Neutral       True Negative
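
As an illustration of how such a three-class confusion matrix is filled in, the sketch below
counts (actual, predicted) label pairs. The label order, the function name
`confusion_matrix_3class` and the toy label lists are assumptions made for the example; rows are
actual classes and columns are predicted classes, as in Table 1.

```python
LABELS = ["positive", "neutral", "negative"]

def confusion_matrix_3class(actual, predicted, labels=LABELS):
    """Rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

# Toy labels, invented for illustration.
actual = ["positive", "positive", "neutral", "negative", "negative", "neutral"]
predicted = ["positive", "neutral", "neutral", "negative", "positive", "neutral"]

matrix = confusion_matrix_3class(actual, predicted)
for label, row in zip(LABELS, matrix):
    print(f"{label:>8}: {row}")
```

The diagonal entries count the correctly classified reviews of each class; every off-diagonal
entry is one kind of misclassification.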


3. RESULTS AND DISCUSSION 

This section discusses the results and findings of the experiments. 46% of the reviews (22,903
reviews) belong to the CIMB banking application, followed by 34% (17,028 reviews) from the Maybank
application, 9% (4,752 reviews) from the Hong Leong Bank application, 5% (2,618 reviews) from the
AM Bank application, and 4% (1,787 reviews) and 2% (1,082 reviews) from the RHB Bank and Public
Bank applications respectively. As shown in Figure 2, even though the CIMB application has the
largest number of reviews, most of the reviews given by its users were 1-star ratings. For the
Maybank application, on the other hand, most of the reviews given were 5-star ratings. The star
rating is a score that represents the user’s impression of a product or service. The number of
stars given can be described as the user’s level of satisfaction with these mobile banking
applications.

Most of the polarity scores are between 0 and 1, meaning that the sentiment is more often positive
than neutral or negative. Table 2 shows that positive sentiment has the highest count of 27,718.
The sentiment counts for each mobile banking application show more positive sentiment, especially
for the Maybank, CIMB and Hong Leong Bank applications, where the positive count is close to or
more than twice each of the negative and neutral counts.

A word cloud is used to find keywords related to mobile banking applications. The word clouds
consist of the words related to the negative and positive sentiment classes. Figure 3 shows the
words related to positive sentiment, where the most frequent words were “good”, “new”, “easy”, and
“problem”. From this list, it can be assumed that users feel satisfied when the application is
updated, user friendly and able to solve their problems when making transactions. On the other
hand, Figure 4 shows the words related to negative sentiment, where the most frequent words are
“stupid”, “bad”, “slow”, “worst” and “useless”. From the list generated, it can be assumed that
users are having issues when trying to log in to their accounts. It can also be said that the
negative sentiments are closely related to the performance of the application in performing online
transactions.
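
A word cloud is essentially a visualization of term frequencies within each sentiment class. A
minimal sketch of the underlying count, using the standard library and a few invented, already
tokenized reviews, might look like this; `top_words` is a hypothetical helper name.

```python
from collections import Counter

def top_words(reviews, sentiment, n=3):
    """Most frequent tokens among reviews labelled with the given sentiment."""
    counts = Counter()
    for tokens, label in reviews:
        if label == sentiment:
            counts.update(tokens)
    return counts.most_common(n)

# Invented (tokens, label) pairs for illustration only.
reviews = [
    (["good", "easy", "app"], "positive"),
    (["good", "new", "update"], "positive"),
    (["slow", "login", "bad"], "negative"),
    (["slow", "useless"], "negative"),
]
print(top_words(reviews, "positive", n=1))
print(top_words(reviews, "negative", n=1))
```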


Figure 2. Rating given by user for each banking application


Table 2. Sentiments by bank type
Bank type         Positive   Neutral   Negative   Total
CIMB              10,568     5,945     6,390      22,903
Maybank           11,488     3,659     1,881      17,028
Hong Leong Bank   2,860      999       893        4,752
AM Bank           1,399      582       637        2,618
RHB Bank          941        362       484        1,787
Public Bank       462        302       318        1,082
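
The counts in Table 2 can be turned into per-bank proportions to make the comparison between
applications easier. The sketch below does this arithmetic with the values from Table 2; the
dictionary layout and the function name `positive_share` are choices made for the example.

```python
# (positive, neutral, negative) counts taken from Table 2.
COUNTS = {
    "CIMB": (10568, 5945, 6390),
    "Maybank": (11488, 3659, 1881),
    "Hong Leong Bank": (2860, 999, 893),
    "AM Bank": (1399, 582, 637),
    "RHB Bank": (941, 362, 484),
    "Public Bank": (462, 302, 318),
}

def positive_share(counts):
    """Fraction of a bank's reviews that carry positive sentiment."""
    return counts[0] / sum(counts)

for bank, counts in COUNTS.items():
    print(f"{bank:<16} total={sum(counts):>6}  positive={positive_share(counts):.1%}")

total_positive = sum(c[0] for c in COUNTS.values())
total_reviews = sum(sum(c) for c in COUNTS.values())
print("All banks:", total_positive, "positive out of", total_reviews)
```

With these counts, Maybank has the largest positive share (about 67%) and Public Bank the smallest
(about 43%).
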

In this section, the accuracy, precision, recall and F1-score are compared. Table 3 shows the
confusion matrix for MNB. Out of 5,017 total reviews, 846 negative, 631 neutral and 2,695 positive
sentiments were predicted correctly, giving the MNB classifier an accuracy of 83.16%. Figure 5
shows the precision, recall and F-measure for each sentiment class using the weighted average. The
negative and neutral classes have the highest recall values, meaning that a higher proportion of
the actual negative and neutral sentiments were correctly predicted. The positive class, on the
other hand, shows high precision, where precision is defined as the proportion of texts correctly
predicted as positive over the total number of texts predicted as positive.

Figure 5. Precision, recall and F1 score values for MNB classifier


Table 3. Confusion matrix for MNB classifier
                    Predicted
                    Negative   Neutral   Positive
Actual   Negative   846        41        48
         Neutral    41         631       28
         Positive   174        513       2,695


Table 4 shows the confusion matrix for BNB. Out of 5,017 total reviews, 729 negative sentiments,
947 neutral sentiments and 2,577 positive sentiments were predicted correctly. BNB achieves an
accuracy of 84.77%, while its precision, recall and F1-score are shown in Figure 6. As with MNB,
BNB has a higher proportion of negative and neutral sentiments correctly predicted out of the
actual amount, and the positive sentiment class shows the highest precision.

Figure 6. Precision, recall and F1 score values for BNB classifier


Table 4. Confusion matrix for BNB classifier
                    Predicted
                    Negative   Neutral   Positive
Actual   Negative   729        71        178
         Neutral    52         947       16
         Positive   280        167       2,577


Furthermore, the confusion matrix for LSVM is shown in Table 5. Out of 5,017 total reviews, 988
negative sentiments, 1,164 neutral sentiments and 2,723 positive sentiments were predicted
correctly, giving the LSVM classifier an accuracy of 97.17%. Figure 7 shows the precision, recall
and F-measure for each sentiment class. LSVM shows a slightly different result from both NB
classifiers, in that the negative sentiment class has a high recall value. On the other hand, the
neutral and positive sentiment classes show high precision values of 98.23% and 98.27%
respectively. The results show that LSVM is the best technique, with the highest accuracy,
precision, recall and F1-score, for predicting the sentiment of customer feedback on mobile
banking applications. For the results obtained above, the positive and negative word clouds
indicate the most frequent words that appear in the feedback given by customers.

Figure 7. Precision, recall and F1 score values for LSVM classifier


Table 5. Confusion matrix for LSVM classifier
                    Predicted
                    Negative   Neutral   Positive
Actual   Negative   988        9         26
         Neutral    29         1,164     22
         Positive   44         12        2,723
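
The reported LSVM figures can be checked against the confusion matrix. The sketch below computes
accuracy and support-weighted precision, recall and F1-score from the counts in Table 5, taking
the number of correctly predicted positive reviews as 2,723 as stated in the text. The function
name `weighted_metrics` and the use of support-weighted averaging are assumptions made for the
example; rows are treated as actual classes and columns as predicted classes.

```python
def weighted_metrics(matrix):
    """Accuracy and support-weighted precision, recall and F1 for a square
    confusion matrix with rows = actual and columns = predicted."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[i][i] for i in range(n)) / total
    precision = recall = f1 = 0.0
    for i in range(n):
        tp = matrix[i][i]
        support = sum(matrix[i])                           # actual count of class i
        predicted = sum(matrix[k][i] for k in range(n))    # predicted count of class i
        p = tp / predicted if predicted else 0.0
        r = tp / support if support else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        weight = support / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1

# Class order: negative, neutral, positive (counts from Table 5).
lsvm = [
    [988, 9, 26],
    [29, 1164, 22],
    [44, 12, 2723],
]
acc, prec, rec, f1 = weighted_metrics(lsvm)
print(f"accuracy={acc:.2%} precision={prec:.2%} recall={rec:.2%} f1={f1:.2%}")
```

Under these assumptions the computed values round to 97.17%, 97.21%, 97.17% and 97.18%, in line
with the reported accuracy, precision, recall and F1-score.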


4. CONCLUSION 

This research has the potential to be improved in future sentiment analysis studies. To obtain
more valuable information from mobile banking application reviews, additional features may be
implemented to give a more comprehensive solution for determining the sentiment of user reviews.
As the current research considers only the review text given by users, a hybrid study that
analyses the sentiment behind both the user review and the rating score could be carried out by
using the mean of the star rating and the numeric polarity score. This would provide valuable
insight from the emotions and opinions written by users. Future work should also evaluate
performance using deep learning models such as convolutional neural network (CNN), recurrent
neural network (RNN) and long short-term memory (LSTM). Deep learning models are worth exploring
as they may provide deeper analysis of sentiment and more accurate sentiment detection.